Emotion conversion using F0 segment selection
Abstract
This paper describes F0 segment selection, a novel syllable-based F0 conversion method, which provides a concatenative framework to search for F0 segments in a modest corpus of emotional speech (∼15 minutes of data). The method is compared with our earlier work on F0 generation using context-sensitive syllable HMMs. Both methods are complemented with a duration conversion module as well as GMM-based spectral conversion to form a unified emotion conversion framework for English. The system was evaluated using three target styles: surprise, anger and sadness. The results of an extensive perceptual test show that segment selection significantly outperforms the HMM-based method in terms of both emotion recognition rates and intonation quality ratings for surprise and anger. For conveying sadness, both methods were effective.
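To illustrate what a concatenative search over syllable-level F0 segments can look like, the following is a minimal sketch of a Viterbi-style selection. It is not the authors' implementation: the `Segment` class, the numeric `context` features, the absolute-difference target cost, the boundary-F0 join cost and the `w_join` weight are all assumptions made for illustration. The sketch also assumes every syllable has at least one candidate segment.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    context: Sequence[float]  # hypothetical numeric context features of a syllable
    f0: Sequence[float]       # F0 contour of the syllable, in Hz


def target_cost(target_context: Sequence[float], seg: Segment) -> float:
    # Assumed cost: L1 distance between desired and stored context features.
    return sum(abs(a - b) for a, b in zip(target_context, seg.context))


def join_cost(prev: Segment, nxt: Segment) -> float:
    # Assumed cost: F0 discontinuity at the syllable boundary.
    return abs(prev.f0[-1] - nxt.f0[0])


def select_segments(target_contexts: List[Sequence[float]],
                    candidates: List[List[Segment]],
                    w_join: float = 1.0) -> List[Segment]:
    """Return the lowest-cost segment sequence, one segment per syllable."""
    if not target_contexts:
        return []
    # best[i][j]: lowest cumulative cost of any path ending in candidates[i][j]
    best = [[target_cost(target_contexts[0], c) for c in candidates[0]]]
    back = [[-1] * len(candidates[0])]
    for i in range(1, len(target_contexts)):
        row, ptr = [], []
        for cand in candidates[i]:
            costs = [best[i - 1][k] + w_join * join_cost(prev, cand)
                     for k, prev in enumerate(candidates[i - 1])]
            k_best = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_best] + target_cost(target_contexts[i], cand))
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for i in range(len(target_contexts) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path


# Toy example: two syllables, two candidate segments each.
toy_targets = [[0.0], [1.0]]
toy_candidates = [
    [Segment([0.0], [100.0, 120.0]), Segment([0.5], [100.0, 200.0])],
    [Segment([1.0], [125.0, 110.0]), Segment([1.0], [200.0, 180.0])],
]
toy_path = select_segments(toy_targets, toy_candidates)
```

In the toy example the second candidate of the first syllable has a slightly worse context match, but it joins the following segment without an F0 jump, so the search prefers it. Setting `w_join=0.0` reduces the search to picking the best context match per syllable.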
Similar resources
Data-driven emotion conversion in spoken English
This paper describes an emotion conversion system that combines independent parameter transformation techniques to endow a neutral utterance with a desired target emotion. A set of prosody conversion methods have been developed which utilise a small amount of expressive training data (∼15 min) and which have been evaluated for three target emotions: anger, surprise and sadness. The system perfo...
A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality
This paper describes a system that combines independent transformation techniques to endow a neutral utterance with some required target emotion. The system consists of three modules that are each trained on a limited amount of speech data and act on differing temporal layers. F0 contours are modelled and generated using context-sensitive syllable HMMs, while durations are transformed using pho...
Investigation on Pleasure Related Acoustic Features of Affective Speech
This paper presents our recent work on the investigation of affective speech along the pleasure-displeasure (P) dimension in the PAD emotional space. First, 76 conventional features are taken as candidates, and a novel feature, F0 Dominant Ratio, designed to reflect the dominant pitch, is introduced. Correlation coefficients are calculated and a preliminary selection is made. Fact...
Emotional Voice Conversion for Mandarin using Tone Nucleus Model – Small Corpus and High Efficiency
GMM-based spectral conversion techniques have been applied to emotion conversion, but it was found that spectral transformation alone is not sufficient for conveying the required target emotion. In this paper, we adopt the tone nucleus model to carry the most important information of tones and to represent the F0 contour for Mandarin speech. The tone nucleus part is then converted to emotional speech fr...
A Unit Selection Approach to F0 Modeling and Its Application to Emphasis
This paper presents a new unit selection approach to F0 modeling for speech synthesis. We construct the F0 contour of an utterance by selecting portions of contours from a recorded speech database. In this approach, the elementary unit is the segment, which gives the system flexibility to combine segments from different phrases and model both macroprosody and microprosody. This method was imple...
Publication date: 2008